Semi-Supervised Collective Classification via Hybrid Label Regularization
Authors
Abstract
Many classification problems involve data instances that are interlinked with each other, such as webpages connected by hyperlinks. Techniques for collective classification (CC) often increase accuracy for such data graphs, but usually require a fully-labeled training graph. In contrast, we examine how to improve the semi-supervised learning of CC models when given only a sparsely-labeled graph, a common situation. We first describe how to use novel combinations of classifiers to exploit the different characteristics of the relational features vs. the non-relational features. We also extend the ideas of label regularization to such hybrid classifiers, enabling them to leverage the unlabeled data to bias the learning process. We find that these techniques, which are efficient and easy to implement, significantly increase accuracy on three real datasets. In addition, our results explain conflicting findings from prior related studies.
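As a rough illustration of the two ideas named in the abstract, the sketch below combines the outputs of a non-relational (attribute) classifier and a relational classifier, and adds a label-regularization penalty computed on unlabeled nodes. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the product-and-renormalize combination rule, the KL form of the penalty, and the weight `lam` are all assumptions chosen for illustration.

```python
import numpy as np


def combine_hybrid(attr_probs, rel_probs):
    """Combine two classifiers' per-class probabilities.

    Assumption: a simple product-and-renormalize rule; the paper may use
    a different combination.
    """
    joint = attr_probs * rel_probs
    return joint / joint.sum(axis=1, keepdims=True)


def label_regularization_penalty(unlabeled_probs, class_prior, eps=1e-12):
    """KL(class_prior || mean predicted distribution on unlabeled nodes).

    The penalty is near zero when the average prediction over the
    unlabeled nodes matches the expected class proportions.
    """
    mean_pred = unlabeled_probs.mean(axis=0)
    log_ratio = np.log(class_prior + eps) - np.log(mean_pred + eps)
    return float(np.sum(class_prior * log_ratio))


def objective(labeled_probs, labels, unlabeled_probs, class_prior,
              lam=1.0, eps=1e-12):
    """Negative log-likelihood on labeled nodes plus the weighted penalty.

    `lam` is a hypothetical trade-off weight, not a value from the paper.
    """
    picked = labeled_probs[np.arange(len(labels)), labels]
    nll = -np.mean(np.log(picked + eps))
    penalty = label_regularization_penalty(unlabeled_probs, class_prior)
    return nll + lam * penalty
```

In this sketch the penalty only constrains the average prediction over the unlabeled nodes, so it biases learning toward the expected class proportions without assigning a label to any individual node.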
Related resources
A Generative Model with Network Regularization for Semi-Supervised Collective Classification
In recent years, much effort has been devoted to Collective Classification (CC) techniques for predicting labels of linked instances. Given a large amount of labeled data, conventional CC algorithms can make use of local labeled neighbours to increase accuracy. However, in many real-world applications, labeled data are limited and very expensive to obtain. In this situation, most of the data hav...
Simple, Robust, Scalable Semi-supervised Learning via Expectation Regularization
Although semi-supervised learning has been an active area of research, its use in deployed applications is still relatively rare because the methods are often difficult to implement, fragile in tuning, or lacking in scalability. This paper presents expectation regularization, a semi-supervised learning method for exponential family parametric models that augments the traditional conditional lab...
READER: Robust Semi-Supervised Multi-Label Dimension Reduction
Multi-label classification is an appealing and challenging supervised learning problem, where multiple labels, rather than a single label, are associated with an unseen test instance. To remove possible noise in labels and in high-dimensional features, multi-label dimension reduction has attracted more and more attention in recent years. The existing methods usually suffer from several pro...
Regularized Semi-supervised Classification on Manifold
Semi-supervised learning estimates the marginal distribution P_X from a large number of unlabeled examples and then constrains the conditional probability p(y | x) with a few labeled examples. In this paper, we focus on a regularization approach for semi-supervised classification. The label information graph is first defined to keep the pairwise label relationship and can be incorporated wi...
A Semi-supervised Method for Multimodal Classification of Consumer Videos
In large databases, the lack of labeled training data leads to major difficulties in classification. Semi-supervised algorithms are employed to mitigate this problem. Video databases are the epitome of such a scenario. Fortunately, graph-based methods have been shown to form promising platforms for semi-supervised video classification. Based on multimodal characteristics of video data, different fe...
Publication date: 2012